REMARKS 



Applicant requests reconsideration and allowance. The amendments to the 
claims were made to correct minor grammatical errors and were not made to 
distinguish the claims from the art of record. The following remarks will show that 
the invention is different from and patentable over the art of record that the rejection 
applies to the claims. 

The invention generates synthesized speech that uses fundamental stored 
audio segments that are different from the audio segments used by the art of record. 
The prior art conventionally uses bands of one phone to form allophones; the 
invention uses bands from two adjacent phones to create the allophone and then 
applies one or more rules to smooth out the concatenated audio segments. 

Claim 132 was rejected based on a finding that the reference of Gagnon (US 
5,463,715) shows each of the steps of that claim. That finding is erroneous because 
the reference and the invention use different audio segments to create the 
concatenated sounds. Every sound/phone has three bands: initial, center (solo) and 
final. Gagnon forms allophones by interpolating waveforms containing only one band 
of a phone. In contrast, the invention concatenates two audio segments which contain 
at least three bands of two adjacent phones. 

The above difference follows from the different inventory of audio elements 
selected by Gagnon and by the invention. In Gagnon, each stored audio segment is 
limited to only the initial, center and final waveform of one phone. In contrast, the 
audio segments of the invention include portions of at least two sounds/phones.

See Gagnon, col. 4, lines 31-49. Gagnon forms the short "a" of the word "cat" 
by selecting an initial waveform (band) from a stored segment corresponding to an 
articulation initial waveform (band) of "a" as found in a "ca" sound, a center (solo) 
waveform of a short "a" and a final waveform (band) of an "at" sound. Gagnon is 
straightforward: each sound/phone is individually created from stored bands of that 
one sound/phone and then the sounds/phones are concatenated together by 
interpolation. 

The invention starts with different building blocks. Instead of storing only 
audio segments of initial, center (solo) and final bands of a sound/phone, the 
invention stores segments that each include one or more bands of two adjacent 
sounds/phones. When two segments of the invention are concatenated, the result is 
a sound that has bands of three sounds/phones. 
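
The point can be illustrated with a minimal sketch. It is not taken from the application or from Gagnon; the names Band, Segment and concatenate, and the phone labels "c", "a" and "t", are hypothetical and are used only to model the band inventory described above.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Band kinds as named in the remarks: initial, center (solo) and final.
INITIAL, SOLO, FINAL = "initial", "solo", "final"


@dataclass(frozen=True)
class Band:
    phone: str  # the sound/phone the band belongs to (hypothetical label)
    kind: str   # INITIAL, SOLO or FINAL


@dataclass(frozen=True)
class Segment:
    bands: Tuple[Band, ...]

    def phones(self) -> List[str]:
        """Return the distinct phones covered, in order of first appearance."""
        seen: List[str] = []
        for band in self.bands:
            if band.phone not in seen:
                seen.append(band.phone)
        return seen


def concatenate(left: Segment, right: Segment) -> Segment:
    """Join two stored segments end to end (smoothing rules are omitted)."""
    return Segment(left.bands + right.bands)


# Assumed example for the word "cat": each stored segment spans two adjacent phones.
seg_ca = Segment((Band("c", SOLO), Band("c", FINAL), Band("a", INITIAL)))
seg_at = Segment((Band("a", SOLO), Band("a", FINAL), Band("t", INITIAL)))

joined = concatenate(seg_ca, seg_at)
print(seg_ca.phones())  # phones covered by one stored segment
print(joined.phones())  # phones covered after concatenation
```

In this sketch each stored segment covers two phones, and the concatenation of the two segments covers three, which is the property stated above.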

See Fig. 2b of the application. Note that the audio segment has bands from 
two different but adjacent sounds. It has the solo (center) and end (final) band of the 
first sound/phone and the start co-articulation band of the second sound/phone. When 
one compares the invention of Fig. 2b to Fig. 2 of Gagnon, one finds that the stored 
audio segment of the invention includes the center and final waveforms (bands) of "c" 
and the initial waveform of the short "a." In contrast, Gagnon stores only the bands of 
each individual sound and does not store audio segments with bands from two 
adjacent sounds. 

The invention provides a set of stored audio segments that have more 
information than the corresponding elements of Gagnon. The invention stores 
portions of the blend of two different but adjacent sounds/phones. In contrast, 
Gagnon relies upon the conventional use of sounds/phones that are modified by 
glottal, labial or medial consonants. The invention provides sound segments that 
already include the seeds of realistic concatenated sounds because each audio segment 
stored by the invention includes information of at least two sounds/phones. In 
contrast, the Gagnon stored segments are segments of only one sound/phone. 

Another difference between the invention and Gagnon is their respective 
boundaries between stored elements and the instances of concatenation. Gagnon has 
boundaries between every band of each phone, and no audio segment of Gagnon 
comprises bands of two phones. In contrast, the segment boundaries of the invention may enclose 
three bands that stretch over two sounds/phones. Most audio segments either begin 
with or end with a solo band. Those that begin with a solo band have the final band of 
the initial sound/phone and terminate with the initial band of the adjacent 
sound/phone. See Fig. 2b, which shows SAB 1/ EKB 1/ AKB 2. Those that end with 
the solo articulation band of a sound/phone begin with a final band of a 
prior, adjacent sound/phone followed by the initial band of the solo 
articulation sound/phone. See Fig. 2c, which shows EKB 1/ AKB 2/ SAB 2. 
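
The two segment layouts can be sketched as follows. This is an illustration only: it assumes that SAB, EKB and AKB denote the solo, end (final) and start (initial) co-articulation bands respectively, that the trailing number is the index of the sound/phone, and that the Fig. 2c sequence reads EKB 1/ AKB 2/ SAB 2. The helper name describe is hypothetical.

```python
from typing import List, Tuple

# (band label, phone index); the label meanings are assumed as stated above.
Band = Tuple[str, int]

FIG_2B: List[Band] = [("SAB", 1), ("EKB", 1), ("AKB", 2)]
FIG_2C: List[Band] = [("EKB", 1), ("AKB", 2), ("SAB", 2)]


def describe(segment: List[Band]) -> Tuple[str, List[int]]:
    """Report where the solo band sits and which phones the segment spans."""
    phones = sorted({phone for _, phone in segment})
    if segment[0][0] == "SAB":
        position = "begins with solo band"
    elif segment[-1][0] == "SAB":
        position = "ends with solo band"
    else:
        position = "no solo band at either end"
    return position, phones


print(describe(FIG_2B))
print(describe(FIG_2C))
```

Under these assumptions both layouts span two sounds/phones, and they differ only in whether the solo band opens or closes the segment.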

A further difference between Gagnon and the invention is the origin of 
adjacent initial and final co-articulation bands. In Gagnon they originate from two 
different elements as each band in Gagnon is its own element. In contrast, the 
invention forms a plurality of its audio segments from bands of adjacent 
sounds/phones. Therefore, the total co-articulation band, which consists of the initial and 
final co-articulation bands, sounds more natural with the invention because the 
elements used to co-articulate originate from the same utterance of a speaker. 

Each independent claim has the same distinguishing limitations discussed 
above. Thus, the claims as presented are patentable over the art of record as applied 
to the claims. A notice of allowance is respectfully requested. 